SUMMARY




This Report describes a series of computer programs which allows drawings of complex
chemical structures to be displayed on a graphics terminal, to be modified using the
graphical input devices and, finally, to be drawn by an incremental plotter. The system
thus  provides  a  means  for  the  interactive  development  of  chemical  structure  diagrams  and 
for  the  production  of  high  quality  drawings  suitable  for  inclusion  in  published  reports. 

The  system  is  based  on  the  graphical  definition  of  several  hundred  chemical  groups. 
The structure of more complex compounds can be built up from these basic units and
displayed to the user. Optional features of the system include variation of the scale of
drawing, interactive modification of the drawings using a light pen and automatic
detection and prevention of drawing overlaps.

1 INTRODUCTION

This Report describes a sequence of computer programs which allows drawings of complex
chemical structures to be displayed on a graphics terminal, to be modified using the
graphical input devices and, finally, to be drawn by an incremental plotter. The system
thus  provides  a  means  for  the  interactive  development  of  chemical  structure  diagrams  and 
for  the  production  of  high  quality  drawings  suitable  for  inclusion  in  published  reports. 

The  programs  were  developed  in  response  to  a  request  from  Materials  Department  at 
RAE,  Farnborough.  The  Polymer  Chemistry  Section  has  developed  a  software  system  which 
examines  the  properties  of  a  set  of  polymers  and  calculates  overall  property  coefficients 
for  the  constituent  chemical  groups.  These  coefficients  can  then  be  used  to  predict  the 
bulk  properties  of  chemical  combinations  of  the  groups  and,  in  theory,  to  predict  which 
combinations  of  groups  would  meet  a  particular  requirement.  The  most  important  polymer 
property  in  this  context  is  the  glass  transition  temperature  which  is  the  temperature 
at  which  a  polymer  changes  from  the  glassy  to  the  rubbery  state. 

The  analysis  programs  operate  on  a  collection  of  approximately  350  groups  and  these 
form  the  basic  building  blocks  of  the  system.  Each  group  is  allocated  a  unique  group 
number.  A  higher  level  group  can  be  constructed  from  three  or  more  basic  groups  and  this 
group also has an identity number. These identity numbers are not yet universally
standardised and so it is necessary to publish not only the results of the calculations, but
also a chemical representation of the higher level groups which is readily intelligible
and universally understood, ie a chemical drawing. Since the results for many thousands
of  groups  and  polymers  need  to  be  published,  the  work  involved  in  the  production  of  the 
drawings  by  hand  would  be  prodigious.  This  Report  describes  a  means  of  automating  the 
drawing  process. 

Examples of drawings produced by the programs are shown in Figs 7 to 13.

2  AN  OVERVIEW  OF  THE  STRUCTURE  DISPLAY  PROGRAM 

Within  this  Report,  a  basic  group  is  referred  to  as  a  shape,  since  it  is  the  basic 
unit  of  graphical  manipulation,  and  the  more  complex  groups  and  polymers  are  referred  to 
as  structures. 

The  structure  display  program  was  designed  to  operate  using  the  same  input  data 
as  is  presented  to  the  analysis  programs.  The  data  definition  of  a  structure  consists 
of  a  series  of  shape  numbers  structured  so  that  the  data  not  only  defines  the  constituents 
of  the  structure,  but  also  how  the  constituents  are  linked  together  to  form  chains  and 
how  chains  are  connected  together  to  form  more  complex  chemical  structures.  Since  the 
input  data  representation  was  fixed,  the  tasks  involved  in  system  development  were: 

(1) to define a data convention which could be used to represent each of the
shapes in graphical terms (see section 4);

(2) to represent all of the shapes in this data format;

(3) to define the topological rules by which shapes are connected into chains and by
which chains are connected together to form structures (see section 7);

(4) to write a program to interpret the input data and the graphical data, to
implement the connection rules and to produce the chemical drawings.

Each shape is defined graphically as a combination of lines, strings of full sized
characters and strings of half sized characters (see Fig 1). Shapes are connected
together  by  a  straight  line  known  as  a  bond  and  one  shape  can  have  several  bonds.  Each 
shape  definition  contains,  for  each  bond,  the  coordinates  of  the  end  of  the  bond  nearest 
its  parent  shape  together  with  the  angle  between  the  bond  and  the  horizontal.  Bonds  are 
normally  of  fixed  length.  Since  a  bond  in  one  shape  must  connect  in  a  straight  line  to 
a  bond  in  the  next  shape,  the  program  must  be  able  to  shift  and  rotate  the  shapes  in  order 
to  match  the  bonds  in  position  and  in  angle  (see  Fig  2). 
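The shift and rotation needed to match two bonds can be sketched as follows. This is an illustrative Python sketch only, not the FORTRAN used by the programs; the function name align_shape and its arguments are hypothetical. It assumes that a single bond line of length bond_length joins the two anchor points, that angles are measured anticlockwise from the horizontal, and it does not model the mirror image case.

```python
import math

def align_shape(points, entry_anchor, entry_angle, exit_anchor, exit_angle,
                bond_length=1.0):
    """Rotate and shift `points` so the entry bond continues the exit bond.

    Anchors are the bond ends nearest their own shapes; angles are in degrees.
    Returns (rotation_in_degrees, list_of_placed_points).
    """
    # Two bonds lie on one straight line when their angles differ by 180 degrees.
    rotation = (exit_angle + 180.0 - entry_angle) % 360.0
    c = math.cos(math.radians(rotation))
    s = math.sin(math.radians(rotation))

    def rotate(p):
        x, y = p
        return (x * c - y * s, x * s + y * c)

    # Assumed geometry: the far end of the exit bond is where the
    # entry anchor of the new shape must land.
    target = (exit_anchor[0] + bond_length * math.cos(math.radians(exit_angle)),
              exit_anchor[1] + bond_length * math.sin(math.radians(exit_angle)))
    ax, ay = rotate(entry_anchor)
    dx, dy = target[0] - ax, target[1] - ay
    placed = [(x + dx, y + dy) for x, y in map(rotate, points)]
    return rotation, placed
```

For example, a shape whose entry bond points left (180°) needs no rotation to join an exit bond pointing right (0°); it is simply shifted so that its anchor sits one bond length beyond the exit anchor.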

The  task  of  constructing  the  data  file  containing  the  graphical  definition  of  all 
the  shapes  was  undertaken  jointly  by  RAE  and  by  The  Rubber  and  Plastics  Research 
Association.  This  file  is  known  as  the  shape  definition  file. 

There  are  three  programs  in  the  complete  chemical  drawing  system.  The  shape 
definition  program  preprocesses  the  shape  definition  data  into  a  readily  accessible  form 
(see section 5); the structure display program connects shapes together and displays the
structures on an interactive graphics terminal; and the hard copy program converts the
output from the structure display program into a form suitable for the production of high
quality drawings on an incremental plotter (see section 9).

The  structure  display  program  reads  the  data  definition  of  a  structure,  analyses 
it  to  extract  the  shape  numbers  which  make  up  the  structure  and  accesses  the  shape 
definition  data  to  obtain  the  graphical  representation  of  each  shape.  Each  representation 
is  displayed  suitably  positioned  and  rotated  so  that  one  bond  connects  in  a  straight 
line  to  a  bond  of  the  previous  shape.  Given  that  a  shape  can  have  bonds  pointing  in 
several  directions  and  the  next  shape  to  be  connected  to  it  can  also  have  several  bonds, 
the  rules  by  which  shapes  are  connected  together  are  necessarily  rather  complex  in  order 
to  cope  with  all  possible  situations.  In  general,  the  structures  are  displayed  as 
orthogonal  chains,  and  bonds  with  intermediate  angles  are  only  used  when  no  horizontal  or 
vertical  bond  is  available. 

The  program  offers  the  user  a  choice  of  several  options  in  order  to  match  what  the 
program  does  to  the  user's  current  requirements.  The  options  offered  are: 

(a)  Interactive  mode.  The  user  can  be  given  the  opportunity  to  change  the  bond 
lengths  of  any  shape  in  the  structure  displayed,  in  order  to  improve  its  appearance  or 
to  prevent  overlaps  between  adjacent  shapes. 

(b)  Copy  mode.  Once  a  drawing  has  been  accepted  by  the  user,  a  definition  of  the 
drawing  can  be  sent  to  a  disk  file  and  this  can  later  be  processed  to  produce  a  permanent 
and  high  quality  copy  on  an  incremental  plotter. 

(c)  An  alternative  algorithm  for  shape  connection.  A  slightly  different  connection 
algorithm  can  be  selected  if  the  algorithm  normally  used  by  the  program  is  not  producing 
satisfactory  results  for  individual  structures.  This  alternative  algorithm  uses  less 
'built-in logic' and gives more weighting to the way the user has supplied the structure
definitions.

(d) Selective mode. Individual structures can be picked out for display rather
than  processing  all  the  structures  in  the  structure  definition  file. 

(e)  Overlap  mode.  The  program  can  check  whether  drawing  overlaps  occur  between 
adjacent  shapes  and  if  so,  it  will  modify  the  connecting  bond  in  an  attempt  to  prevent 
the  overlap. 

(f)  Graphics  mode.  The  program  can  display  on  a  graphics  terminal  all  of  the 
structures as they are processed. Up to nine structures can be displayed at once.

(g)  Scale.  The  scale  of  drawing  can  be  modified  to  magnify  small  structures  or 
to  reduce  the  size  of  the  large  structures. 

These  options  are  presented  to  the  user  in  the  form  of  a  menu  displayed  on  the 
graphics  terminal  and  they  can  be  made  active  or  dormant  at  any  time  by  use  of  a  light 
pen  in  association  with  the  menu. 

The  program  thus  provides  the  user  with  a  powerful  array  of  facilities  to  develop 
visually  attractive  chemical  drawings  and  the  man/computer  interface  is  designed  to  be 
easy  to  use  for  those  chemists  who  have  little  computer  experience. 

Appendix  A  contains  an  example  of  shape  definition  data.  Appendix  B  contains 
examples of structure definition data and Figs 7 to 13 give examples of drawings produced
on  the  incremental  plotter. 

3  THE  HARDWARE  AND  SOFTWARE  ENVIRONMENT 


The programs run on a PDP11 minicomputer under the RSX11M operating system and
they  are  written  in  FORTRAN.  The  structure  display  program  occupies  slightly  less  than 
32K words of memory.

The  graphics  terminal  is  a  Vector  General,  refreshed  'line'  display,  with  16K  of 
memory  in  the  interface  from  which  the  picture  on  the  screen  is  refreshed.  The  program 
makes  use  of  the  keyboard  associated  with  the  display  to  take  in  alphanumeric  information 
and  of  the  light  pen  to  allow  the  user  to  identify  lines  and  menu  items  on  the  screen. 

The graphics software used is the General Purpose Graphics System (GPGS) (Refs 2 and 3). The
package  consists  of  a  number  of  device  drivers  together  with  a  device  independent  library 
of  FORTRAN  callable  routines.  More  details  of  the  features  of  GPGS  which  are  used  by  the 
programs  are  given  in  Appendix  C. 

4 THE SHAPE DEFINITION FILE

Each  chemical  shape  is  defined  by  a  shape  number  and  four  categories  of  information: 

(1)  the  lines  making  up  the  shape; 

(2)  the  strings  of  full  sized  characters  annotating  the  shape; 

(3)  the  strings  of  half  sized  characters  providing  further  annotation; 

(4)  the  positions  of  the  connecting  bonds  and  their  angles  to  the  horizontal. 


This information is specified as a series of fixed format records and the records must
occur in the order:

    line data
    full sized text data
    half sized text data
    bond data.

Since many chemical shapes are similar in graphical structure, a category of data need only
be  supplied  if  the  data  for  the  current  shape  differs  from  that  for  the  same  category  of 
the  previous  shape,  ie  if  a  category  of  data  is  omitted,  the  corresponding  data  for  the 
previous  shape  is  assumed  to  be  repeated.  The  shape  definitions  can  be  supplied  in  any 
order,  ie  not  necessarily  in  shape  number  order,  and  so  similar  shapes  can  be  grouped 
together  to  simplify  this  task  of  shape  definition.  It  is  important  to  note  that  omitting 
a  category  of  data  does  not  mean  that  no  data  is  defined  for  this  category.  If  no  data 
is  to  be  defined,  an  explicit  'no  data'  record  specifying  zero  shape  lines  or  character 
strings  must  be  provided,  unless  the  previous  shape  had  a  'no  data'  record  for  this 
category. 
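The inheritance rule for omitted categories can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the in-memory layout (a dictionary per shape), the category names and the function name resolve_shapes are hypothetical and do not reflect the fixed format records specified in Appendix A. An absent key stands for an omitted category and an empty list stands for an explicit 'no data' record.

```python
CATEGORIES = ("lines", "full_text", "half_text", "bonds")

def resolve_shapes(raw_shapes):
    """Fill in omitted categories from the previous shape in file order.

    raw_shapes: list of (shape_number, {category: records}) in the order
    in which the shapes appear in the file (not necessarily number order).
    """
    resolved = {}
    # Assumption: before the first shape, every category is empty.
    previous = {cat: [] for cat in CATEGORIES}
    for number, given in raw_shapes:
        current = {}
        for cat in CATEGORIES:
            if cat in given:
                current[cat] = list(given[cat])     # supplied (possibly 'no data')
            else:
                current[cat] = list(previous[cat])  # omitted: repeat previous shape
        resolved[number] = current
        previous = current
    return resolved
```

Because inheritance follows file order rather than shape number order, grouping similar shapes together keeps the definitions short.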

The  line  data  is  provided  as  a  series  of  relative  (X,Y)  coordinates  together  with 
indications  of  whether  the  resulting  lines  are  to  be  visible  or  invisible;  the  character 
strings  are  specified  as  collections  of  characters,  one  string  per  record;  and  the  bond 
data  for  each  bond  consists  of  the  absolute  (X,Y)  coordinates  of  the  end  point  nearest 
the  parent  shape  together  with  the  angle  of  the  bond  to  the  horizontal.  Valid  bond 
angles  range  from  0°  to  345°  in  anticlockwise  increments  of  15°.  A  detailed  specification 
of  the  shape  data  format  is  given  in  Appendix  A,  together  with  some  examples. 

When a shape is rotated, the centre point of each character string is rotated to
shift the position of the string but the text remains horizontal, ie relative to the
lines making up the shape, the text is rotated about the centre point of the string.
The resulting relationship between the text and the lines might not be visually
satisfactory and interference can occur between the lines and the text (see Fig 4b). The
program  provides  two  facilities  to  aid  the  user  in  avoiding  some  of  the  worst  instances 
which  occur  during  shape  rotation. 

The  user  can  define  a  rotation  relation  for  the  current  shape.  This  is  a  shape 
which,  when  displayed,  looks  like  the  current  shape  rotated  through  90°  and  thus  the 
program  can  use  the  data  representation  of  the  rotation  relation  instead  of  performing 
a  strict  +90°  or  -90°  rotation  on  the  data  representation  of  the  current  shape.  For 
example,  shapes  6  and  7  are  rotation  relations  (see  Fig  4a). 

The  rotation  relation  can  either  be  a  shape  which  is  defined  elsewhere  in  the  shape 
definition  file  and  has  a  shape  number,  or  the  definition  can  immediately  follow  the 
definition  of  the  current  shape,  in  which  case  it  has  no  separate  shape  number.  The  use 
of  a  rotation  relation  guarantees  that  the  shape  drawing  which  results  from  a  +90°  or 
-90°  rotation  will  be  aesthetically  acceptable. 

Secondly,  the  user  can  direct  the  program  to  maintain  the  relative  positions  of  two 
or  more  character  strings  during  rotation.  This  is  always  necessary  when  the  text  is 
defined to create subscripts, eg if the shape is to be annotated with the text 'CH₂',
this must be defined as two strings, 'CH' and '2', and the start position of the
string '2' must be fixed relative to the start position of the string 'CH' if the rotated
versions are to be visually acceptable. The user can request in a string definition that
the current string is to be 'attached' to the preceding string for the same shape and then
the relative positions of the two strings are always maintained.
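The treatment of text during rotation described in this section can be sketched as follows. This is an illustrative Python sketch, not the program's own code; the names rotate_point and place_strings and the dictionary layout are hypothetical. As a simplification, each string is represented by a single reference point: an unattached string has its reference point rotated with the shape, whereas an attached string keeps its original offset from the preceding string. In both cases the text itself remains horizontal.

```python
import math

def rotate_point(p, degrees):
    """Rotate point p anticlockwise about the origin."""
    r = math.radians(degrees)
    return (p[0] * math.cos(r) - p[1] * math.sin(r),
            p[0] * math.sin(r) + p[1] * math.cos(r))

def place_strings(strings, degrees):
    """Return [(text, reference_point)] after the shape is rotated.

    strings: list of dicts with keys 'text', 'ref' (x, y) and 'attached'.
    """
    placed = []
    for i, s in enumerate(strings):
        if s["attached"] and i > 0:
            prev = strings[i - 1]
            # Keep the unrotated offset from the preceding string.
            offset = (s["ref"][0] - prev["ref"][0], s["ref"][1] - prev["ref"][1])
            base = placed[-1][1]
            ref = (base[0] + offset[0], base[1] + offset[1])
        else:
            ref = rotate_point(s["ref"], degrees)
        placed.append((s["text"], ref))
    return placed
```

With 'CH' at (2, 0) and an attached '2' at (3, -0.5), a 90° rotation moves 'CH' to (0, 2) and keeps the '2' just below and to the right of it at (1, 1.5); without attachment the '2' would move to (0.5, 3) and no longer read as a subscript.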

The order in which the bonds are specified can affect the layout of a drawing.
When attempting to match the bonds of two shapes, the bonds are examined in the order in
which they are specified in the shape definition data and so the bond which is defined
first is given preference. Furthermore, whenever the structure display program has
to make an arbitrary selection of a bond, it selects the next available bond, again
taking the bonds in the order in which they were originally defined. In general, bonds
should therefore be defined in order of the preferred directions of development. Section 7
gives further details on the method of shape connection.

5  THE  SHAPE  DEFINITION  PROGRAM 

On a PDP11 minicomputer there is insufficient program address space to
retain all the shape definitions in immediately accessible form in memory and the shape
definitions are thus stored in a random access file on disk. A preprocessor program, the
shape  definition  program,  reads  the  user's  shape  definition  file  and  produces  the  random 
access  file  to  be  used  by  the  drawing  program.  The  drawing  program  can  easily  bring  shape 
definitions  into  memory  when  they  are  required  but  an  overhead  in  accessing  the  disk 
unit  is  involved. 

If  the  format  conventions  of  the  shape  definition  data  are  broken,  the  program 
prints  an  error  message  indicating  the  contravention  which  it  has  detected.  Appendix  F 
contains  a  list  of  the  error  messages  which  can  be  produced  during  execution  of  the  shape 
definition  program. 

6  THE  STRUCTURE  DEFINITION  FILE 

The  structure  display  program  produces  drawings  in  accordance  with  instructions 
read  from  the  structure  definition  file.  Each  structure  in  this  file  is  defined  by  a 
text  descriptor  record  and  by  one  or  more  chain  records.  The  text  descriptor  record 
simply contains alphanumeric text describing the structure to be drawn, eg its
identification and perhaps its major chemical attribute. Each chain record consists of a series
of  shape  numbers  separated  by  hyphens  indicating  which  shapes  make  up  the  chain.  An 
example  of  such  a  chain  record  is 

-49-6-163 

This  data  defines  a  chain  consisting  of  shape  49,  shape  6  attached  to  shape  49  and 
shape  163  attached  to  shape  6.  Since  each  structure  can  consist  of  several  linked  chains 
an  asterisk  is  included  after  the  shape  number  whenever  a  shape  is  to  form  a  junction 
between  two  chains.  Such  a  junction  shape  is  known  as  a  link  shape  and  the  number  of 
asterisks  indicates  how  many  chains  are  to  be  connected  to  that  link  shape.  For  example, 

-19**-184**-6

indicates that two chains are to be connected to shape 19 and two chains are to be
connected to shape 184. Subsequent records define the side chains which are to be
attached  to  these  shapes  and  these  records  must  be  ordered  to  correspond  with  the  order 
in  which  the  asterisks  are  encountered,  reading  records  sequentially  and  processing 
characters from left to right. The order of side chain records for the above
would be:

    first side chain to be attached to shape 19
    second side chain to be attached to shape 19
    first side chain to be attached to shape 184
    second side chain to be attached to shape 184.

As  an  extra  confirmation  of  the  order  of  the  records  the  first  shape  number  in  each  side 
chain  record  is  the  shape  number  of  the  link  shape  to  which  it  is  to  be  connected  and  it 
is  not  a  request  for  a  shape  to  be  drawn.  Thus  a  partial  structure  definition  might  be: 

GROUP 9999 HAS 4 SIDE CHAINS
-19**-184**-6
-19-1
-19-174
-184-36
-184-49-103*-49

Here, one of the side chains has a further side chain attached to it.

The  program  maintains  a  'first  in  -  first  out'  stack  of  all  link  shape  requests  and 
so  very  complex  structures  can  be  drawn. 
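The reading of chain records and the 'first in - first out' handling of link shape requests can be sketched as follows. This is an illustrative Python sketch, not the program's FORTRAN; the function names parse_chain and read_structure and the returned data layout are hypothetical, and the text descriptor record is assumed to have been removed already.

```python
import re
from collections import deque

def parse_chain(record):
    """Split a record such as '-19**-184**-6' into (shape number, asterisk count) pairs."""
    return [(int(num), len(stars)) for num, stars in re.findall(r"-(\d+)(\**)", record)]

def read_structure(records):
    """Match side chain records to link shapes in first in, first out order.

    records: main chain record first, then the side chain records.
    Returns (chains, pending) where chains is a list of
    (link_shape_or_None, [shape numbers to draw]) and pending lists link
    shape requests for which no side chain record was supplied.
    """
    pending = deque()
    chains = []
    for index, record in enumerate(records):
        shapes = parse_chain(record)
        link = None
        if index > 0:
            if not pending:
                raise ValueError("side chain record without a link shape request")
            link = pending.popleft()
            head, _ = shapes.pop(0)   # confirmation of the link shape; not drawn
            if head != link:
                raise ValueError(f"expected link shape {link}, found {head}")
        chains.append((link, [number for number, _ in shapes]))
        for number, count in shapes:
            pending.extend([number] * count)   # one request per asterisk
    return chains, list(pending)
```

Applied to the partial definition for group 9999 above, the sketch attaches two side chains to shape 19 and two to shape 184 and reports one outstanding request for shape 103. A record containing only the link shape number yields an empty list of shapes, which corresponds to a side chain with nothing to draw.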

If  a  side  chain  record  contains  only  the  number  of  the  associated  link  shape,  the 
side  chain  is  called  a  dummy  side  chain.  Dummy  side  chains  can  be  used  to  force  the 
program  to  abandon  its  preferred  method  of  linking  chains  together  and  to  adopt  a  sequence 
preferred  by  the  user.  A  dummy  side  chain  effectively  links  a  null  shape  to  the  first 
available  bond  of  the  link  shape  concerned.  The  null  shape  cannot  be  seen  but  the  con¬ 
necting  bond  is  marked  as  'occupied'  by  the  program  and  is  not  available  for  subsequent 
linking. Fig 3 contains examples of how link shapes and dummy side chains can be
used to modify the appearance of a drawing.

Appendix  B  contains  further  details  on  the  format  of  structure  definition  data 
together  with  some  examples. 

7  THE  METHOD  OF  SHAPE  CONNECTION 

The  data  contained  in  the  structure  definition  file  does  not  imply  a  unique  method 
of  drawing  each  structure  defined  in  it.  The  program  still  has  many  options  as  to  how 
each  shape  can  be  connected  to  its  neighbour  and  how  chains  of  shapes  can  be  connected 
to  the  link  shapes. 

In  the  following  description,  the  bond  of  the  current  shape  which  is  attached  to 
the  previous  shape  is  referred  to  as  the  entry  bond  of  the  current  shape  and  the  bond  to 
which  it  is  connected  is  known  as  the  exit  bond  of  the  previous  shape. 

The  program  performs  the  following  procedure  in  order  to  construct  a  chain  of 
linked  shapes: 

(1)  It  draws  the  first  shape,  perhaps  rotated  to  create  a  horizontal  bond,  and 
attempts  to  add  succeeding  shapes  in  a  horizontal,  rightward  direction  or  in  a  vertical, 
downward  direction. 

(2) It adds the next shape in the current direction, ie it selects an exit bond
opposite  to  (180°  out  of  phase  with)  the  entry  bond.  If  no  such  bond  exists,  it  selects 
the  first  unused  bond  as  the  exit  bond. 

(3) It selects the entry bond of the current shape by considering the rotation
necessary to connect each potential entry bond to the selected exit bond. The program
contains a list of bond phase differences, stored in order of preference. A 180° phase
difference is the first preference since no rotation would be required to connect the two
shapes; a 0° phase difference is the second preference, in which case a mirror image
rotation would be required; and the next two preferences are 90° and 270° when a right angled
rotation would be required. These are followed by the other possible phase differences
in a rather arbitrary order:

210° 30° 120° 300° 225° 45° 135° 315° 240° 60° 150° 330°

Using this hierarchy of phase differences, each bond of the current shape is examined
first of all for a 180° phase difference with the selected exit bond. If no such match
is found the program looks for a 0° phase difference, and so on down the hierarchy until
a match is found and the entry bond has thus been selected.
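The selection of the entry bond in step (3) can be sketched as follows. This is an illustrative Python sketch, not the program's own code; the names PHASE_PREFERENCE and select_entry_bond are hypothetical, and the sign convention for the phase difference (entry bond angle minus exit bond angle, modulo 360°) is an assumption.

```python
# Phase differences in the order of preference given in the text.
PHASE_PREFERENCE = [180, 0, 90, 270,
                    210, 30, 120, 300,
                    225, 45, 135, 315,
                    240, 60, 150, 330]

def select_entry_bond(exit_angle, candidate_angles):
    """Return (bond index, phase difference) of the preferred entry bond.

    candidate_angles lists the bond angles of the current shape in the
    order in which they were defined. Returns None if no candidate has a
    phase difference in the preference list.
    """
    for phase in PHASE_PREFERENCE:
        # For each preference level, bonds are examined in definition order.
        for index, angle in enumerate(candidate_angles):
            if (angle - exit_angle) % 360 == phase:
                return index, phase
    return None
```

The outer loop runs over phase differences and the inner loop over bonds, so a later defined bond with a 180° phase difference is chosen ahead of an earlier defined bond that would need a rotation.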

This  procedure  defines  how  chains  are  constructed  but  a  further  mechanism  is 
required  when  a  new  chain  is  started,  attached  to  a  previously  drawn  link  shape.  In 
this  case,  the  program  has  even  more  options  open  since  there  is  no  immediately  obvious 
way  of  identifying  which  potential  exit  bond  from  the  link  shape  should  be  used.  If  the 
'take  link  shape  exits  sequentially'  option  is  enabled  (see  section  8.2)  the  program 
merely  takes  the  next  free  bond  of  the  link  shape  as  the  exit  bond  and  the  matching 
procedure is then identical to (3) above. If the option is disabled, the program:

(a) takes the first free bond of the link shape;

(b) examines each bond of the current shape for a 180° phase difference;

(c) if no match is found, examines each bond of the current shape for 0°, 90°
and 270° phase differences in succession, until a match is found (there is thus a
strong bias in favour of using the first free bond of the link shape as the exit bond,
and this bond will be selected whenever the connection can be made using a rotation
of a multiple of 90°);

(d) if still no match is found, repeats (b) and (c) for successive
bonds of the link shape until a match is found;

(e) if no match is found, repeats (a), (b), (c) and (d) but using
phase differences 210°, 30°, 120° and 300° successively;

(f) if no match is found, repeats (a), (b), (c) and (d) but using
phase differences 225°, 45°, 135° and 315° successively;

(g) if no match is found, repeats (a), (b), (c) and (d) but using
phase differences 240°, 60°, 150° and 330° successively.

Again, the search priority is arbitrary after the first four phase differences. The
search priority can be represented in tabular form and Table 1 gives the order of priority
for a link shape with three free bonds to be connected to the current shape which has
two bonds.
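The search priority for connection to a link shape, with the sequential exit option disabled, can be sketched as a set of nested loops. This is an illustrative Python sketch and not a reproduction of Table 1; the names PHASE_GROUPS, search_order and select_link_connection are hypothetical, the phase difference is assumed to be the current shape bond angle minus the link shape bond angle (modulo 360°), and the third group uses 135° as in the list given under step (3).

```python
PHASE_GROUPS = [(180, 0, 90, 270),
                (210, 30, 120, 300),
                (225, 45, 135, 315),
                (240, 60, 150, 330)]

def search_order(n_link_bonds, n_current_bonds):
    """Yield (link bond, phase difference, current bond) in the order tried.

    Bonds are numbered from 0 in definition order; n_link_bonds counts only
    the free bonds of the link shape.
    """
    for group in PHASE_GROUPS:                      # steps (e), (f), (g)
        for link_bond in range(n_link_bonds):       # steps (a) and (d)
            for phase in group:                     # steps (b) and (c)
                for current_bond in range(n_current_bonds):
                    yield link_bond, phase, current_bond

def select_link_connection(link_angles, current_angles):
    """Return (link bond, current bond, phase) for the first match, or None."""
    for link_bond, phase, current_bond in search_order(len(link_angles),
                                                       len(current_angles)):
        if (current_angles[current_bond] - link_angles[link_bond]) % 360 == phase:
            return link_bond, current_bond, phase
    return None
```

For a link shape with three free bonds and a current shape with two bonds, search_order(3, 2) yields 96 combinations, the first 24 of which involve only phase differences that are multiples of 90°.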

8  THE  STRUCTURE  DISPLAY  PROGRAM 

8.1 Initialisation phase

The  structure  display  program  poses  a  series  of  initial  questions  in  order  to 
obtain  the  basic  information  necessary  to  start  the  program  run: 

(a)  DATA  FILE  NAME? 

The user replies by typing on the terminal the name of the file containing the definition
of the structures to be drawn, eg

STRUCTS.DAT (RETURN)

or

DK2:[1,6]POLYMS.DAT (RETURN)


(b)  COPY  FILE  NAME? 

The  program  requires  the  name  of  the  output  file  which  is  to  contain  the  drawing 
definitions  to  be  used  for  the  production  of  hard  copies.  The  file  name  is  only  used 
if  the  hard  copy  option  is  activated  and  so  the  response  is  irrelevant  if  the  user  does 
not  intend  to  produce  hard  copy  output.  However,  the  reply  must  still  be  a  valid 
RSX11M  file  name  or  an  error  message  is  produced  by  the  RSX11M  file  system. 

If  a  file  of  the  given  name  already  exists,  a  new  generation  of  the  file  is 
created  and  the  old  version  is  untouched.  Examples  of  valid  responses  are 

X  (RETURN) 

STRUCTS.COP (RETURN)

(c) SCREEN ARRAY SIZE?

The  user  can  choose  to  collect  the  drawings  into  1,  4  or  9  element  arrays.  The 
reply  to  this  question  can  thus  be 


1 (RETURN)

2 (RETURN)

or

3 (RETURN)

to indicate a 1 x 1, 2 x 2 or 3 x 3 matrix of drawings. Any other value will be rejected
and  the  question  will  be  asked  again.  When  using  the  Vector  General  display,  the  larger 
the  array  size,  the  smaller  will  be  the  size  of  the  individual  drawings  displayed  on  the 
screen. 
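The relationship between the reply and the size of each drawing can be sketched as follows. This is an illustrative Python sketch; the function name array_cells, the unit screen dimensions and the ordering of the cells (row by row from the top left) are assumptions and are not taken from the program.

```python
def array_cells(reply, width=1.0, height=1.0):
    """Return the (xmin, ymin, xmax, ymax) region of each drawing.

    reply is the answer to SCREEN ARRAY SIZE?: 1, 2 or 3, giving a
    1 x 1, 2 x 2 or 3 x 3 array of drawings on the screen.
    """
    if reply not in (1, 2, 3):
        # The program rejects any other value and asks the question again.
        raise ValueError("array size must be 1, 2 or 3")
    w, h = width / reply, height / reply
    return [(col * w, height - (row + 1) * h, (col + 1) * w, height - row * h)
            for row in range(reply)
            for col in range(reply)]
```

Each side of a cell is the screen dimension divided by the reply, so a 3 x 3 array leaves each drawing one third of the linear size available to a single drawing.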

8.2 Setting the program environment

The  program  reads  the  first  array  of  structures  from  the  structure  definition  file 
and  displays  the  drawings  on  the  Vector  General  screen.  In  addition  it  displays  a  menu 
of environment options across the top of the screen. An option can be switched on and off
alternately by 'picking' the corresponding text with the light pen. An option which is
activated has a '#' sign displayed by the side of the corresponding menu item.

A  menu  item  is  picked  by  pointing  the  light  pen  at  the  appropriate  text.  When 
the  pen  can  'see'  the  text,  a  'A'  symbol  is  displayed  beneath  the  text  and  the  selection 
can  then  be  confirmed  by  touching  the  metal  tip  of  the  pen  with  a  finger  of  the  hand 
holding  the  pen. 

The options available are:

(a)  Copy  mode  (menu  item  COPY):  when  the  option  is  enabled,  the  program  produces  an 
output  file  containing  a  definition  of  all  drawings  processed.  This  output  file  can 
later be processed to produce hard copy output (see section 9). This facility is used
for production runs.

(b)  Trace  (menu  item  TRACE):  when  this  option  is  selected  the  program  prints  out 
a  short  piece  of  text  together  with  a  series  of  values  whenever  a  significant  program 
decision  is  taken.  This  facility  is  not  intended  for  general  use  but  is  an  aid  to  be  used 
when  debugging  the  program. 

(c)  Batch  mode  (menu  item  BATCH):  the  program  can  either  operate  in  interactive 
mode,  where  the  user  has  the  opportunity  to  examine  each  drawing  and  to  make  modifications 
to  it;  or  in  batch  mode,  where  the  program  does  not  pause  between  drawings,  but  processes 
a  whole  series,  one  after  the  other.  If  the  user  activates  the  BATCH  option,  the 
question  is  asked: 

BATCH SIZE (I3) ?

to  which  the  user  replies,  on  the  Vector  General  keyboard,  with  three  characters  (spaces 
and/or  right  justified  numbers)  specifying  the  number  of  arrays  of  structures  to  be 
processed  in  the  batch.  For  example,  valid  replies  are: 

123 (RETURN)

∇12 (RETURN)

∇∇9 (RETURN)

where ∇ denotes a space.

If  a  negative  number  is  returned,  then  the  program  processes  all  the  drawings  in  the 
structure  definition  file. 

(d) Graphics on Vector General display (menu item VG): when the drawings are to
be processed in BATCH mode, it is often convenient to suppress the display of the
structures on the Vector General screen. Selection of this menu item causes all
drawing instructions to be ignored, the menu is removed, preventing all further
interaction, and all of the remainder of the structure definition file is processed as a

(e) Link shape exits to be used sequentially (menu item SEQ EXIT): the program
offers  two  methods  of  connecting  the  next  shape  to  be  processed  onto  a  previously  drawn 
link  shape  (see  section  7).  If  this  option  is  selected  then  the  bonds  for  the  link 
shape are used sequentially, ie in the order in which they were specified in the shape
definition  file,  and  the  most  suitable  bond  of  the  next  shape  is  fitted  onto  the  next 
free  bond  of  the  link  shape.  If  the  default  algorithm  is  used,  then  each  of  the  link 
shape  bonds  is  compared  with  each  of  the  bonds  of  the  next  shape,  and  the  most  suitable 
pairing  is  selected.  These  two  alternatives  allow  the  program  to  use  its  intelligence 
in  drawing  the  majority  of  the  structures,  but  for  a  minority  of  awkward  cases  any 
required  drawing  scheme  can  be  imposed  to  over-ride  the  preference  built  into  the 
program. 

(f)  Overlap  check  (menu  item  OVERLAP):  the  program  offers  the  option  of 
automatic  detection  and  attempted  correction  of  any  overlaps  which  occur  when  drawing 
the  structures.  If  this  option  is  selected  the  program  retains  the  maximum  and  minimum 
X  and  Y  coordinates  of  each  shape  drawn  and  before  drawing  the  current  shape,  it  checks 
whether  there  is  any  overlap  between  the  rectangle  containing  the  current  shape  and  the 
rectangles  containing  the  earlier  shapes.  If  there  is  an  overlap,  it  attempts  to  avoid 
the  difficulty  by  first  shortening  the  entry  bond,  and  then  incrementing  its  length  up 
to  14  times  the  standard  bond  length.  If  all  overlap  is  prevented  the  shape  is  drawn 
with the extended bond; if overlap still occurs it is drawn with a bond length 15 times
the standard bond length and it is left to the user to remove the overlap during the
interactive phase. The overlap check never operates on a bond whose length has been specified
by  the  user  during  the  interaction  phase  and  so  the  program  cannot  over-rule  the  user's 
requests.  This  interference  check  is  rather  crude,  and  the  corrective  logic  keeps 
shapes  further  apart  than  is  absolutely  necessary.  In  addition,  the  remedy  which  it 
applies  only  acts  on  the  entry  bond  to  the  current  shape  and  not  on  an  earlier  shape  which 
might  be  the  real  cause  of  the  problem.  However,  it  is  effective  for  the  majority  of 
simple  cases  of  overlap. 
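
The overlap check described above can be sketched as follows. This is a minimal illustration in Python, not the FORTRAN of the program itself. The names `Box`, `choose_bond_multiplier` and `TRIAL_MULTIPLIERS` are hypothetical, and the exact order in which bond lengths are tried (the standard length, then a shortened bond of zero length, then 2 to 14 times the standard length) is an assumption; the text states only that the bond is first shortened and then lengthened.

```python
import math
from dataclasses import dataclass

STANDARD_BOND = 5.0  # standard bond length in drawing units (section 10)


@dataclass(frozen=True)
class Box:
    """Rectangle given by the minimum and maximum X and Y of a shape."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def shifted(self, dx, dy):
        return Box(self.xmin + dx, self.ymin + dy, self.xmax + dx, self.ymax + dy)

    def overlaps(self, other):
        # Assumption: rectangles that merely touch do not count as overlapping.
        return (self.xmin < other.xmax and other.xmin < self.xmax
                and self.ymin < other.ymax and other.ymin < self.ymax)


# Assumed trial order: standard length, shortened (zero), then 2..14 times standard.
TRIAL_MULTIPLIERS = (1, 0) + tuple(range(2, 15))
FALLBACK_MULTIPLIER = 15  # used when every trial still overlaps


def choose_bond_multiplier(shape_box, bond_start, angle_deg, earlier_boxes):
    """Return the entry bond length, in standard bond lengths, for the current shape.

    shape_box is the rectangle of the current shape relative to the end of its
    entry bond; bond_start is the point where the entry bond leaves the parent.
    """
    ux = math.cos(math.radians(angle_deg))
    uy = math.sin(math.radians(angle_deg))
    for multiplier in TRIAL_MULTIPLIERS:
        length = multiplier * STANDARD_BOND
        placed = shape_box.shifted(bond_start[0] + length * ux,
                                   bond_start[1] + length * uy)
        if not any(placed.overlaps(box) for box in earlier_boxes):
            return multiplier
    return FALLBACK_MULTIPLIER
```

Because only rectangles are compared, the sketch shares the crudeness noted above: two shapes whose rectangles intersect are pushed apart even if their lines never cross.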

(g)  Selective  mode  (menu  item  SELECT):  this  option  allows  the  user  to  select 
individual  structures,  one  at  a  time,  from  the  data  file  or  to  select  a  starting  point 
within  the  data  file.  If  this  option  is  activated,  then  the  program  asks 

NEXT  GROUP  ?  -  4  CHARS 

The  program  expects  the  user  to  type  four  characters  which  define  the  next  (or  first) 
structure  to  be  processed.  The  four  characters  received  are  compared  with  all  four 
character  strings  in  the  text  descriptors  associated  with  successive  records  from  the 
structure definition file and processing only starts when a match has been found
somewhere within a text descriptor. The four character identifier can generally be supplied
in 'free format' if the significant items in the text descriptors which are used to
identify structures are followed by space characters. If the user supplies fewer than
four characters, then the number is made up to four by the addition of trailing spaces.
Thus, writing ∇ for a space character,

10 (RETURN) is equivalent to 10∇∇ (RETURN)

and this will be matched with a string '10∇∇' in a text descriptor. Similarly, a reply
of (RETURN) alone is equivalent to a reply of 4 spaces. This will be matched with the
next text descriptor (unless the text descriptor contains more than 31 characters without
four consecutive spaces) and the next structure is processed. In the RAE application,
the text descriptors contain the group number, right justified. Thus a reply of

∇890 (RETURN)


indicates  that  group  number  890  is  the  next  group  to  be  processed  and  the  structure 
definition  file  is  searched  sequentially  until  this  group  is  found. 
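
The matching rule can be illustrated with a short sketch. It is written in Python for illustration only; the function names `pad_reply`, `pad_descriptor` and `find_structure` are hypothetical, and the 35 character descriptor width is taken from section 10.

```python
REPLY_WIDTH = 4         # the program asks for four characters
DESCRIPTOR_WIDTH = 35   # only the first 35 characters of a descriptor are used


def pad_reply(reply):
    """Make the user's reply up to four characters with trailing spaces."""
    return reply[:REPLY_WIDTH].ljust(REPLY_WIDTH)


def pad_descriptor(text):
    """Truncate or space-fill a text descriptor to its fixed width."""
    return text[:DESCRIPTOR_WIDTH].ljust(DESCRIPTOR_WIDTH)


def find_structure(descriptors, reply, start=0):
    """Return the index of the first descriptor, at or after start, that
    contains the padded reply anywhere within it, or None if no match exists."""
    key = pad_reply(reply)
    for index in range(start, len(descriptors)):
        if key in pad_descriptor(descriptors[index]):
            return index
    return None
```

With descriptors `"GROUP  889"` and `"GROUP  890"`, a reply of a space followed by `890` selects the second record, and an empty reply selects the first record because its space filled tail contains four consecutive spaces.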

If  BATCH  mode  has  been  selected,  selective  mode  is  turned  off  automatically  and 
processing  continues  from  this  point.  The  record  indicated  is  thus  the  first  record  of 
the  batch.  If  BATCH  mode  has  not  been  selected,  the  program  returns  after  the  structure 
has  been  accepted  by  the  user  to  ask: 

NEXT  GROUP  ?  -  4  CHARS 


and  the  cycle  is  repeated.  The  user  can  select  another  structure  which  is  defined 
further  down  the  data  file.  Thus  structures  can  be  processed  selectively,  one  at  a  time 
and  in  the  order  in  which  they  are  defined  in  the  data  file.  If  the  option  is  switched 
off  on  any  occasion,  the  program  returns  to  sequential,  interactive  processing  and  the 
record  selected  is  thus  the  starting  point  for  an  interactive  session. 

If  the  program  fails  to  find  the  required  text  descriptor,  the  program  terminates 
when  all  of  the  structure  definition  file  has  been  read. 

(h)  Change  scale  (menu  item  SCALE):  the  initial  drawing  area  for  a  structure  is 
300  units  by  300  units.  For  the  display  of  a  single  shape,  this  area  is  often  too  large 
and the drawing would benefit from scaling up; for the display of a complex polymer,
the area is often too small, data is lost over the edges and the drawing would benefit from
scaling down. When the SCALE menu item is selected the program displays the message:

TYPE SCALE FACTOR (F3.1)

and  the  user  replies  with  three  characters,  including  a  decimal  point,  on  the  Vector 
General  keyboard  to  indicate  the  desired  scaling  factor,  eg 

2.0  (RETURN) 

or 

0.5 (RETURN)

The initial drawing area is divided by the factor supplied to give the current drawing
area.  Thus  a  factor  of  2.0  indicates  that  the  drawing  area  mapped  onto  the  screen  is 
halved  in  the  X  and  Y  directions  and  the  size  of  the  drawing  is  doubled.  The  new  scale 
remains  in  force  until  the  SCALE  menu  item  is  selected  again.  Returning  a  factor  of  1.0 
restores  the  drawing  area  to  its  default  state. 
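
As a worked illustration of the scaling rule, the following Python sketch computes the side of the current drawing area from the scale factor. The names `INITIAL_AREA` and `drawing_area` are hypothetical; the rejection of non-positive factors is an assumption, since the text does not say how such replies are treated.

```python
INITIAL_AREA = 300.0  # side of the default drawing area, in drawing units


def drawing_area(scale_factor):
    """Side of the square drawing area mapped onto the screen for a given factor."""
    if scale_factor <= 0:
        # Assumption: a zero or negative factor is treated as an error.
        raise ValueError("scale factor must be positive")
    return INITIAL_AREA / scale_factor
```

A factor of 2.0 gives a 150 unit area (the drawing appears doubled in size), 0.5 gives a 600 unit area, and 1.0 restores the default 300 units.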

(i)  Redraw  the  current  array  (menu  item  REDRAW):  until  a  structure  has  been 
drawn  the  program  cannot  know  the  bounds  of  the  drawing.  Having  once  drawn  it,  the 
program  knows  how  far  to  offset  the  drawing  of  each  structure  in  order  to  centre  it  in 
the  drawing  area  and  the  REDRAW  menu  item  allows  the  user  to  request  that  the  current 
array  of  structures  be  re-displayed  with  each  structure  centred  in  its  drawing  area. 

(j)  Cease  execution  (menu  item  STOP):  Program  execution  can  be  terminated  by 
selecting  the  STOP  menu  item. 

Menu  items  can  be  selected  at  any  time  but  the  selection  is  only  activated  when 
the  drawing  of  the  current  array  of  structures  has  been  completed.  Selection  of  the 

TRACE,  SEQ  EXITS,  OVERLAP  or  SCALE 

menu  items  causes  the  current  array  of  structures  to  be  re-drawn  with  the  new  option 
in  force. 

In  the  initial  program  environment,  the  overlap  check  is  activated,  and  copy  mode 
and  selective  mode  are  switched  off. 

8.3 Shape processing

The  structure  display  program  reads  the  structure  definitions  from  the  input  data 
file  and  sends  the  drawing  instructions  to  the  Vector  General  display  and/or  to  a 
'copy'  output  file.  As  each  structure  definition  is  read,  the  program  prints  the 
associated text descriptor on the terminal as a permanent record of which structures have
been  processed. 

In  BATCH  mode  no  action  is  required  from  the  user.  The  program  continues  processing 
structures  until  the  required  batch  size  has  been  processed  or  until  the  structure 
definition  file  has  been  exhausted.  In  both  cases,  the  program  ceases  execution  and  is 
removed  from  memory. 

In  interactive  mode,  the  user  can  examine  each  array  of  structures  after  it  has  been 
displayed,  and  then  optionally  modify  any  of  the  bond  lengths.  When  a  user  response  is 
required  the  program  rings  the  terminal  bell  as  a  prompt  and  the  user  then  has  the  option: 

(a) to accept the array of structures as displayed and to proceed to the next
structure.  In  this  case,  the  user  just  replies  with: 

(RETURN) 

on  the  Vector  General  keyboard,  and  the  program  rings  the  terminal  bell  again  to  indicate 
that  the  interaction  has  been  accepted; 


(b)  to  change  the  length  of  a  bond  in  any  of  the  structures  displayed  for  the 
purposes  of  improving  the  appearance  of  the  structure.  The  user  points  the  light  pen  at 
the bond in question. When the pen is pointing directly at a bond, a 'A' symbol is displayed
at the end of the bond nearest to its parent shape. To trigger the interaction,
the  user  touches  the  metal  tip  of  the  pen  with  a  finger  of  the  hand  holding  the  pen 
whilst  the  pen  is  pointing  at  the  required  bond.  The  program  rings  the  terminal  bell 
to  indicate  that  the  interaction  has  been  accepted  and  the  user  then  types  a  single  digit 
number  (0  to  9)  and  (RETURN)  on  the  Vector  General  keyboard.  The  program  multiplies 
this  number  by  the  standard  bond  length  to  determine  the  requested  bond  length  whenever 
the  selected  bond  is  drawn.  Thus  the  separation  between  shapes  can  be  varied  at  the 
user's  request  between  zero  and  9  times  the  standard  separation,  and  all  changes  in 
bond  length  are  faithfully  reproduced  in  the  'copy'  output  file.  Alterations  in  bond 
lengths  can  be  used  to  avoid  shape  overlaps  when  complex  shapes  are  being  drawn. 

After  the  amended  picture  has  been  drawn,  the  user  is  again  faced  by  the  two  options 
(a)  and  (b)  described  above  and  this  interactive  loop  is  pursued  until  option  (a)  is 
selected. 

Appendix  D  gives  a  step  by  step  description  of  the  structure  display  program  as  it 
displays  a  chemical  structure  and  Appendix  E  contains  the  program  flowcharts.  Appendix  F 
contains  a  list  of  error  messages  which  can  be  produced  during  execution. 

9  THE  HARD  COPY  PROGRAM 

The  hard  copy  program  reads  the  drawing  definition  file  created  by  the  structure 
display  program  and  produces  a  magnetic  tape  containing  drawing  instructions  in  a  format 
suitable  for  processing  on  a  Calcomp  905/936  plotting  system.  The  plotter  then  produces 
drawings  identical  to  those  displayed  on  the  Vector  General  screen. 

The  program  asks  for  the  name  of  the  drawing  definition  file: 

COPY  FILE  NAME  ? 

and the user replies with the name of the file which was supplied to the structure display
program for use as the 'copy' file.

If  the  magnetic  tape  unit  is  not  'ONLINE'  with  a  magnetic  tape  positioned  at  the 
beginning  of  tape  marker  and  write  enabled,  the  program  produces  the  message: 

MAG  TAPE  NOT  READY  -  PRESS  RETURN  TO  RETRY 

When  the  user  replies  with 

(RETURN) 

on  the  terminal,  the  program  checks  the  status  of  the  magnetic  tape  unit  and  then  either 
repeats the message or initialises the magnetic tape to receive the drawing instructions.

The program instructs the user:

TYPE IN PLOT SIZE IN CMS (F3.0)

The reply consists of two figures separated by a decimal point, eg

4.5 (RETURN)

or

8.0 (RETURN)


The  number  provided  by  the  user  defines  a  square  which  is  to  contain  the  plot  of  one  array 
of  structures  as  displayed  on  the  Vector  General  screen.  The  program  further  needs  to 
establish  how  the  collection  of  these  square  plots  is  to  be  laid  out  on  the  paper.  The 
squares  can  be  collected  together  to  form  a  'page'  and  the  pages  can  be  collected  in 
columns  so  that  a  minimum  of  plotting  paper  is  used.  The  program  instructs: 

TYPE NO. ACROSS & NO. DOWN/PAGE (2I2)

and it expects a reply of 2 two digit numbers or 2 one digit numbers with leading spaces
to specify how the drawings are collected into pages. Examples of valid replies, writing
∇ for a space character, are:

1010 (RETURN) ie 10 across and 10 down

∇2∇3 (RETURN) ie 2 across and 3 down

The next instruction is:

TYPE NO. OF PAGES DOWN (I2)

and the program expects a reply of a two digit number or one digit number with a leading
space to specify how many pages are to be drawn in a column, eg valid replies are:

10 (RETURN) ie 10 pages/column

∇3 (RETURN) ie 3 pages/column

The  drawing  area  of  each  plot  is  delimited  by  the  lines  of  the  corresponding  square, 
each  plot  within  a  page  is  separated  from  its  neighbours  by  a  gap  of  0.5  cm  and  pages 
are  separated  by  a  gap  of  4  cm.  Since  the  width  of  the  plotter  paper  is  86  cm,  the 
user  should  choose  values  such  that 


p(sd + 0.5(d - 1)) + 4(p - 1) < 86

where d is the number of plots vertically/page
      p is the number of pages/column
      s is the size of the plot in centimetres.

The values used to produce the drawings in Appendix C are:
the  plot  size  is  8.0  cm 

a  page  consists  of  two  plots  across  and  three  down 
three  pages  are  plotted  per  column. 
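
The paper width constraint can be checked before a plot is started. The following Python sketch evaluates the left-hand side of the inequality above; the names `column_extent` and `fits_on_paper` are hypothetical and the code is an illustration, not part of the hard copy program.

```python
PAPER_WIDTH_CM = 86.0  # width of the plotter paper
PLOT_GAP_CM = 0.5      # gap between neighbouring plots within a page
PAGE_GAP_CM = 4.0      # gap between pages


def column_extent(s, d, p):
    """Extent in cm of one column: p pages, each holding d plots of size s."""
    page_extent = s * d + PLOT_GAP_CM * (d - 1)
    return p * page_extent + PAGE_GAP_CM * (p - 1)


def fits_on_paper(s, d, p):
    """True when the chosen values satisfy the paper width inequality."""
    return column_extent(s, d, p) < PAPER_WIDTH_CM
```

For the Appendix C values (s = 8.0, d = 3, p = 3) the extent is 3 x (24 + 1) + 8 = 83 cm, which satisfies the constraint; a fourth page per column would not.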

At  the  start  of  each  plot  the  program  rings  the  terminal  bell  to  indicate  to  the 
user  how  quickly  the  drawings  are  being  processed.  In  addition,  at  every  tenth  plot,  it 
prints  the  total  number  of  plots  processed  so  far  so  that  the  user  has  an  approximate 
record  of  how  many  plots  are  stored  on  the  magnetic  tape. 

The  contents  and  validity  of  a  drawing  definition  file  can  quickly  be  checked  by 
running  a  different  program  which  reproduces  the  drawings  on  the  Vector  General  screen. 

10  PROGRAM  LIMITATIONS 

The sizes of various data fields are fixed within the program, mainly by declaring
fixed size arrays in the FORTRAN code, and this imposes limitations on the scope of
problem which can be handled. The main limitations are:

(1)  Each  shape  definition  can  have  up  to 

(a)  83  X  and  Y  coordinate  pairs 

(b)  15  full  sized  text  strings  of  up  to  5  characters 

(c)  2  half  sized  text  strings  of  up  to  15  characters 

(d)  10  bonds  with  angles  0°  to  360°  in  15°  increments. 

(2)  The  maximum  number  of  shapes  which  can  be  defined  is  400. 

(3)  The  maximum  shape  number  is  500. 

(4)  Up  to  20  link  shapes  can  be  stored  on  the  'future  link  shape'  stack. 

(5) Up to 40 bonds can have a non-standard length defined by the user in any array
of structures.

(6) Each structure is displayed by default in a 300 x 300 display area. Data
appearing outside this area is normally not displayed and lines are clipped at
the boundary.

(7)  An  array  of  structures  can  be  defined  by  up  to  100  structure  description  records 
consisting  of  up  to  1500  non-space  characters. 

(8)  Eight  shape  descriptions  are  maintained  in  the  cyclic  buffer. 

(9)  When  the  overlap  option  is  used,  a  structure  may  consist  of  up  to  100  shapes. 

(10)  A  text  descriptor  record  can  consist  of  up  to  35  characters. 

In  addition,  certain  fixed  values  are  written  into  the  program: 

(1)  The  standard  bond  length  is  5  units  and  thus  shapes  are  normally  separated 
by  10  units. 

(2)  Full  sized  characters  are  drawn  on  a  6  *  7  matrix  but  commonly  only  the 
first  five  units  in  X  are  used  to  allow  for  the  spacing  between  characters. 


(3)  Half  sized  characters  are  drawn  4/7  the  size  of  full  sized  characters. 

11 CONCLUSION

The  objective  in  writing  this  system  was  to  take  the  data  already  in  use  with  the 
polymer analysis programs and to produce visually attractive chemical diagrams representing
the structures defined. Much initial effort has been expended in the definition of
the shape data and in the development of the program to display the diagrams. The
resulting  programs  and  data  now  constitute  a  production  tool  which  can  be  used  to 
create  chemical  structure  drawings  in  large  numbers.  Furthermore,  the  system  is 
flexible  in  that  shape  definitions  are  easily  added  to  the  shape  definition  file  and  so 
the  application  of  the  system  is  much  wider  than  the  area  of  polymer  chemistry. 

Acknowledgments 

Grateful  thanks  are  due  to  Dr  W.A.  Lee  of  Materials  Department  who  has  provided 
the  expertise  in  chemistry  which  was  necessary  to  produce  the  system  definition  and  has 
supported  the  project  with  consistent  enthusiasm  throughout  its  development. 

The  assistance  of  The  Rubber  and  Plastics  Research  Association  of  Great  Britain 
in  the  preparation  of  the  shape  definition  data  is  also  gratefully  acknowledged. 



Appendix  A 

FORMAT  OF  THE  SHAPE  DEFINITION  DATA 
A.1 The format specification

The  data  for  each  shape  defines: 

(a)  the  shape  number 

(b)  the  lines  making  up  the  shape 

(c)  strings  of  full  sized  characters  and  their  positions 

(d)  strings  of  half  sized  characters  and  their  positions 

(e)  the  positions  of  the  connecting  bonds  and  their  angles  to  the  horizontal. 

The  shape  definitions  can  be  supplied  in  any  order  and  a  category  of  data  need  only  be 
supplied  if  the  data  for  the  current  shape  differs  from  the  same  category  for  the  previous 
shape. 

The first record for a shape contains a space character (column 1) and the shape
number (columns 2 to 4).

The line data is provided as a record containing an 'S' (column 1) and the number
of lines in the shape (columns 6 to 8), followed by a series of records giving the X-Y
coordinates. Each X-Y coordinate pair defines a line relative to the coordinates of the
endpoint of the previous line, or relative to the point (0,0) for the first line of a
shape. Each record contains up to seven pairs of relative coordinates and the format is:

Columns:  (12 to 15, 16 to 19)  (22 to 25, 26 to 29)  ...  (72 to 75, 76 to 79)
             DX1       DY1         DX2       DY2              DX7       DY7


Invisible vectors are specified by adding 500 to a positive X (but not Y) value and
subtracting 500 from a negative X (but not Y) value. If the shape contains no lines,
then the number of lines must be specified as zero and the coordinate records omitted.
Brackets may be included in the line data to enable the data to be read more easily.

For  example,  the  line  data  for  a  shape  999  which  is  a  square  of  20  units  inside  a  square 
of  30  units  could  be  written  as  follows: 

999 

S  10 

(  510  -10) (  0  20) (  -20  0)(  0  -20) (  20  0) (  505  -5) (  0  30) 

(  -30  0)(  0  -30)  (  30  0) 
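
The encoding of relative and invisible vectors can be illustrated by decoding the example for shape 999. The sketch below is in Python and is an illustration only: the names `decode_pairs` and `trace` are hypothetical, the coordinates are read in free format rather than from the fixed columns given above, and an X value of exactly 500 is assumed to mean an invisible vector with zero X displacement.

```python
import re

INVISIBLE_OFFSET = 500  # added to (or subtracted from) X to mark an invisible vector


def decode_pairs(text):
    """Extract the (DX, DY) pairs from bracketed line data, ignoring layout."""
    numbers = [int(n) for n in re.findall(r"-?\d+", text)]
    if len(numbers) % 2:
        raise ValueError("line data must contain complete X-Y pairs")
    return list(zip(numbers[0::2], numbers[1::2]))


def trace(pairs):
    """Convert relative pairs into absolute segments (start, end, visible)."""
    x = y = 0  # the first line is relative to the point (0,0)
    segments = []
    for dx, dy in pairs:
        visible = True
        if dx >= INVISIBLE_OFFSET:
            dx -= INVISIBLE_OFFSET
            visible = False
        elif dx <= -INVISIBLE_OFFSET:
            dx += INVISIBLE_OFFSET
            visible = False
        end = (x + dx, y + dy)
        segments.append(((x, y), end, visible))
        x, y = end
    return segments


SHAPE_999 = """
(  510  -10)(  0  20)(  -20  0)(  0  -20)(  20  0)(  505  -5)(  0  30)
(  -30  0)(  0  -30)(  30  0)
"""
segments = trace(decode_pairs(SHAPE_999))
```

Decoding the ten pairs gives an invisible move to (10,-10), four visible lines forming the inner square with corners at (±10, ±10), an invisible move to (15,-15) and four visible lines forming the outer square with corners at (±15, ±15).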

The  first  record  of  the  full  sized  text  data  contains  an  'F'  (column  1)  and  the 
number  of  full  sized  text  strings  associated  with  the  shape  (columns  6  to  8).  Each 
subsequent  record  defines  one  of  these  strings  in  the  following  way: 

(a)  the  X  and  Y  absolute  coordinates  of  the  start  position  of  the  string 
(columns  12  to  15  and  16  to  19); 

(b)  the  number  of  characters  in  the  string  (columns  21  to  25); 

(c) the characters in the string, left justified (columns 31 to 46).

For example, the full sized text data to label the outer square of the shape previously
defined  might  be: 

F  1 

(  -14  -14)  5  OUTER 

The  records  for  the  half  sized  text  are  identical  in  format  to  those  for  the 
full  sized  text  data.  For  example,  the  half  sized  text  data  to  label  the  inner  square 
of  the  shape  999  might  be: 

H  1 

(  -9  -9)  5  INNER 

The  first  record  for  the  bond  data  contains  a  'B'  (column  1)  and  the  number  of 
bonds associated with the shape (columns 6 to 8). Each subsequent record provides up
to  four  bond  definitions  and  each  bond  definition  provides  the  angle  of  the  bond  to 
the  horizontal  followed  by  the  absolute  coordinates  of  the  bond  start  position.  Valid 
bond  angles  range  from  0°  (horizontal,  pointing  right)  to  345°  in  anticlockwise  increments 
of  15°  and  each  shape  must  have  at  least  one  bond.  The  format  for  bond  records  is  as 
follows: 

Columns:  (12 to 15, 16 to 20, 21 to 24)  (27 to 30, 31 to 35, 36 to 39)
            Angle 1     X1        Y1        Angle 2     X2        Y2

          (42 to 45, 46 to 50, 51 to 54)  (57 to 60, 61 to 65, 66 to 69)
            Angle 3     X3        Y3        Angle 4     X4        Y4

For  example,  the  bond  data  to  add  an  outward  pointing  bond  on  each  corner  of  shape  999 
might  be: 

B  4 

(  45  10  10)(  135  -10  10)(  315  10  -10)(  225  -10  -10) 

The  data  for  a  shape  is  terminated  by  a  record  having  a  space  character  in 
column 1 and this record should contain a shape number as the first item of data for the
next  shape. 

A.2 Overcoming the problems of rotation

When  a  shape  is  rotated,  the  centre  point  of  each  string  is  rotated  to  shift  the 
position  of  the  string  but  the  text  remains  horizontal,  ie  relative  to  the  lines  making 
up  the  shape,  the  text  is  rotated  about  the  centre  point  of  the  string.  The  rotated 
relationship  between  the  text  and  the  lines  might  not  be  visually  satisfactory  (see 
Fig  4c)  and  interference  can  occur  between  the  text  and  the  lines  (see  Fig  4b).  The 
user  can  take  two  steps  to  avoid  the  worst  of  these  cases: 

(a)  define  a  rotation  relation  for  the  shape 

(b) define all text strings with great care.

The first record for a shape can optionally contain an item of information to
indicate  how  the  shape  should  be  rotated  when  the  program  finds  it  necessary  to  do  so. 
Most  shapes  rotate  satisfactorily  by  180°  to  generate  a  mirror  image  but  problems  do 
occur  when  rotations  of  +90°  and  -90°  are  necessary.  For  such  asymmetrical  shapes,  the 
program  needs  data  for  the  shape  as  it  should  appear  when  rotated  through  90°  which  can 
be  used  in  place  of  the  shape  data  for  the  unrotated  shape.  Such  a  related  shape  is 
known as a rotation relation. Columns 6 to 8 of the first record indicate whether
a  rotation  relation  exists  and  they  can  take  one  of  the  following  three  values: 

(a)  the  number  of  a  shape  which  is  defined  elsewhere  in  the  data.  For  example, 
shapes  6  and  7  are  rotation  relations  (see  Fig  4a); 

(b) a value of -1, meaning that the data definition of a rotation relation
immediately follows the definition of the current shape. This rotation relation does not
have a shape number, it cannot be accessed directly by the user's structure definition
requests and it is only drawn when the current shape is to be rotated by +90° or -90°;

(c) a value of 0 or spaces indicating that there is no rotation relation and
a strictly rotated form of the shape is always to be displayed.

Defining rotation relations caters for rotations of +90° and -90° when dealing
with  awkward  shapes  (see  Fig  4b),  but  problems  can  occur  even  with  relatively  simple 
rotations  of  180°  if  the  text  data  is  not  carefully  defined  (see  Fig  4c).  In  this 
example,  the  bonds  should  always  point  at  the  centre  of  the  C  atom  but  a  strict 
rotation  of  180°  forces  the  bonds  to  point  at  the  centre  of  the  H  atom.  In  order  to 
overcome  this  problem,  the  'C'  must  be  defined  as  a  separate  string  so  that  it  rotates 
about  its  own  centre  and  retains  its  position  relative  to  the  bond  lines.  In  some  way, 
the  location  of  the  'H'  must  be  defined  relative  to  the  position  of  the  'C'.  If  the 
number  of  characters  in  a  string  is  specified  as  negative,  the  program  forces  the  current 
string to retain the same position relative to the preceding string, whatever rotation
is performed, eg the text data for the example given might be:

F  2

(  4  10)  1  C

(  10  10)  -1  H

Note  that  the  coordinate  position  of  the  'H'  is  still  expressed  in  absolute  (unrotated) 
coordinates  but  it  will  always  retain  a  position  (+6,  0)  relative  to  the  start  position 
of  the  string  'C',  whatever  rotation  is  performed,  ie  the  string  'H'  is  attached  to  the 
string  'C'.  An  attached  relationship  can  exist  between  several  successive  strings,  even 
between  the  last  full  sized  text  string  and  the  first  half  sized  text  string,  but  the 
first  string  for  a  shape  cannot  be  an  attached  string  since  there  is  no  preceding  string 
to  which  it  can  be  attached.  Attached  strings  are  always  used  for  defining  subscripts, 
eg for a shape containing 'CH' followed by a subscript digit, the text data might be:

F  3

(  0  0)  1  C 

(  6  0)  -1  H 

(  12  -5)  -1  2 

The  use  of  attached  strings  to  define  subscripts  can  itself  create  unwanted  side 
effects  during  rotation  (see  Fig  5).  Here  the  problem  is  caused  by  the  fact  that  no 
allowance  is  made  for  the  presence  of  the  subscript  when  the  shape  is  rotated.  This 
problem  can  be  avoided  by  including  a  'dummy'  space  character  in  the  'F'  string  so  that 
the string is rotated about the approximate centre of the string 'F2', rather than about
the  centre  of  the  string  'F'. 


(  17  -13)  2  F
(  23  -18)  -1  2
(  17  -29)  2  F
(  23  -34)  -1  2
(  -27  -13)  2  F
(  -21  -18)  -1  2
(  -27  -29)  2  F
(  -21  -34)  -1  2

To  be  able  to  cope  with  a  general  rotation  without  interfering  with  line  data  or 
other  character  data,  a  string  and  any  attached  strings  must  be  free  to  trace  out  a 
circle  with  a  centre  at  the  point  of  the  rotation  and  the  distance  to  the  'furthest' 
text  point  from  the  centre  as  the  radius,  ie  in  Fig  6  if  the  boxes  represent  characters, 
then  no  line  data  or  other  character  data  must  intrude  into  the  circles  if  the  shape  is 
to  be  rotated  satisfactorily  through  all  angles. 

A.3 Examples of shape definition data

The  following  are  some  examples  of  shape  definition  data: 

(1)  The  data  listed  below  defines: 

(a)  Shape  63  which  consists  of  22  lines  and  has  two  bonds; 

(b)  Shape  64  which  consists  of  the  same  lines  as  shape  63,  and  has  two  bonds; 

(c)  Shape  500  which  is  a  dummy  shape  consisting  of  10  bonds  only,  all 
positioned at point (0,0) and varying in angle from 0° to 315° in 45° increments
to form an asterisk. This shape is used to check the appearance of
other  shapes  when  they  are  rotated. 

The  shape  definition  data  for  shapes  63,  64  and  500  (see  Fig  7)  is  as  follows: 

 63  0
S  22
(  16  0)(  9  15)(  -9  15)(  9  15)(  -9  15)(  -16  0)(  -9  -15)
(  9  -15)(  -9  -15)(  9  -15)(  500  4)(  -6  10)(  506  16)(  16  0)
(-502  2)(  -12  0)(-508  14)(  6  10)(  516  0)(  6  -10)(  500  -32)
(  -6  -10)
F  0
H  0
B  2
(  0  25  45)(  180  -9  45)
 64  0
B  2
(  0  25  15)(  180  -9  45)
 500
B  10
(  0  0  0)(  90  0  0)(  270  0  0)(  0  0  0)
(  0  0  0)(  180  0  0)(  135  0  0)(  315  0  0)
(  45  0  0)(  225  0  0)

(2) The data for shape 153 illustrates how a subscript can be defined as a string
attached to the preceding string:

 153  0
S  6
(  15  -9)(  0  -16)(  -15  -9)(  -15  9)(  0  16)(  15  9)
F  11
  -6  0  1  F
  17  -13  2  F
  23  -18  -1  2
  -2  -43  2  F
  4  -48  -1  2
  17  -29  2  F
  23  -34  -1  2
  -27  -13  2  F
  -21  -18  -1  2
  -27  -29  2  F
  -21  -34  -1  2
H  0
B  1
(  90  0  0)

(3) The data for shape 1 illustrates the definition of a rotation relation
immediately following the definition of the shape concerned:

 1  -1
S  0
F  3
  0  0  1  C
  6  0  -1  H
  12  -5  -1  3
H  0
B  1  (  90  2  9)
  0  0
F  2
  0  0  2  CH
  12  -5  -1  3
H  0
B  1  (  180  -2  4)


(4) The data for shapes 231, 279 and 280 illustrate how common data for similar
shapes need only be defined once. These shapes share the same line and string data and
only differ in the position of the bonds:

 231  0
S  12
(  12  -7)(  0  -16)(  -15  -9)(  -15  9)(  0  16)(  12  7)(-501  -3)
(  -7  -4)(  500  -16)(  10  -6)(  514  8)(  0  12)
F  1
  -5  -2  1  N
H  0
B  2
(  180  -18  -7)(  0  12  -7)
 279  0
B  2
(  180  -18  -23)(  0  12  -7)
 280  0
B  2
(  180  -18  -7)(  0  12  -23)

(5) The data for shapes 466 and 416 illustrate how one shape can be defined as the
rotation relation of another shape:


466  416 
S  0 

F  7 


H  0 

B  2 


416  466 
B  2 


5  0 

0  0 
1 1  0 
17  -5 

22  0 
26  -6 
32  -6 


90  7 


C 

( 

F 

2 

) 

1 

6 


9) (  270  7  -2) ( 


(  0  22  4)(  180  3  4)

Each structure in the structure definition file is defined by a text descriptor
record and by one or more chain records. Each record can be up to 80 characters in
length.

Only the first 35 characters of the text descriptor record are used by the program.
If fewer than 35 characters are supplied by the user, then the number is made up to 35 by
the addition of spaces after the user supplied text. The first character of the text
descriptor record must not be a hyphen so that it can be distinguished from a chain record.

Each  chain  record  consists  of  a  series  of  shape  numbers  separated  by  hyphens.  The 
first  character  is  a  hyphen  and  the  record  contains  no  spaces.  Whenever  the  shape  is  a 
link  shape,  an  asterisk  is  included  after  the  shape  number  and  the  number  of  asterisks 
indicates  how  many  chains  are  to  be  connected  to  that  link  shape. 

One chain description can extend over several consecutive records. A continuation
request is indicated by terminating the record with a hyphen rather than a shape number.
The following record is then considered as a continuation of the current record and no
initial hyphen or shape number is required. For example, the record

-184-49-49-49

is equivalent to

-184-49-
49-49
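
The chain record rules can be summarised by a small parser. The sketch below is written in Python for illustration; the names `join_continuations` and `parse_chain` are hypothetical, and it assumes that the text descriptor records have already been separated from the chain records.

```python
def join_continuations(records):
    """Merge continued chain records (those ending in a hyphen) into whole chains."""
    chains = []
    pending = ""
    for record in records:
        pending += record.strip()
        if pending.endswith("-"):
            continue  # continuation request: the next record extends this one
        chains.append(pending)
        pending = ""
    if pending:
        raise ValueError("data ends inside a continued chain record")
    return chains


def parse_chain(chain):
    """Split one chain into (shape number, number of asterisks) pairs.

    The asterisk count is the number of chains to be connected to a link shape.
    """
    if not chain.startswith("-"):
        raise ValueError("a chain record must begin with a hyphen")
    items = []
    for field in chain[1:].split("-"):
        number = field.rstrip("*")
        if not number.isdigit():
            raise ValueError("invalid shape number: %r" % field)
        items.append((int(number), len(field) - len(number)))
    return items
```

Applied to the two records of the example, `join_continuations` returns the single chain `-184-49-49-49`, which `parse_chain` splits into shape 184 followed by three copies of shape 49, none of them link shapes.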


The  following  paragraphs  contain  examples  of  structure  definition  data. 

(1) In order to check the visual validity of the shape definition data, it is necessary:

(a) to display each shape in its normal orientation;

(b) to display each shape rotated through 90°, 180° and 270°;

(c) to display each shape rotated through intermediate angles, ie 45°, 135°, 225°
and 315°.

The dummy shape 500 is used as a base shape to force the shape under consideration to be
rotated (see Appendix A.3, Example 1). The data listed below could be used as a
structure definition file to verify the shape data for shapes 63 and 64, which is given
in Appendix A, Example 1. The output would be in the form of six drawings, three with the
title 'SHAPE 63' and three with the title 'SHAPE 64', as shown in Fig 7.


The structure definition data is as follows:

SHAPE 63
-63
SHAPE 63
-500****
-500-63
-500-63
-500-63
-500-63
SHAPE 63
-500********
-500
-500
-500
-500
-500-63
-500-63
-500-63
-500-63
SHAPE 64
-64
SHAPE 64
-500****
-500-64
-500-64
-500-64
-500-64
SHAPE 64
-500********
-500
-500
-500
-500
-500-64
-500-64
-500-64
-500-64

(2) The data to produce Fig 8 is as follows:

GROUP 1663
-100*
-100-105*-7
-105-7
GROUP 1669
-100-102-51
GROUP 1670
-89-51-
102
GROUP 1671
-100*
-100-105*-45
-105-45
GROUP 1673
-89-6-6
GROUP 1676
-64-102
-31
(3) The data to produce Fig 11 is as follows:

POLYMER 162- 162- 162
-151**-151**-151**
-151-109
-151-109
-151-109
-151-109
-151-109
-151-109
POLYMER 162- 162- 441
-151**-151**-151**
-151-109
-151-109
-151-109
-151-109
-151-109
-151-125
POLYMER 162- 162- 824
-151**-151**-151**
-151-109
-151-109
-151-109
-151-109
-151-124
-151-124
POLYMER 162- 162- 637
-151**-151**-151**
-151-109
-151-109
-151-109
-151-109
-151-125
-151-125
POLYMER 162- 441- 441
-151**-151**-151**
-151-109
-151-109
-151-109
-151-125
-151-109
-151-125
POLYMER 162- 441- 824
-151**-151**-151**
-151-109
-151-109
-151-109
-151-125
-151-124
-151-124
-151-124 
FEATURES OF THE GRAPHICS SOFTWARE

The FORTRAN programs make use of the following features of the graphics software
package, GPGS:

(a) the screen is cleared by a call to the routine CLRDEV;

(b) picture entities are declared between calls to the routines BGNPIC and ENDPIC
and each picture is given a unique integer identifier;

(c) lines are drawn by a routine LINE with parameters:

X-coordinate
Y-coordinate
Pen state (up or down)

(d) an integer identifier can be associated with a line, a collection of lines or
with a string of characters which are declared between calls to the routines BGNNAM
and ENDNAM;

(e) a routine INWAIT will wait for a user response on a given range of devices, the
keyboard or light pen in this case, and it provides the following information about the
response:

(i) an integer identifier specifying which device responded;

(ii) the data provided by the device, ie for a light pen, the picture identifier
and the line/character string identifier for the item selected; or for the
keyboard, the characters typed by the user;

(f) the limits for the drawing space are declared in a call to the routine WINDW
and this coordinate space is mapped onto the screen area. All parameters subsequently
passed to the line drawing routines must be expressed in terms of coordinates within
the declared drawing space.
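
The mapping described in (f) is, in essence, a linear scaling from the declared drawing
space onto the screen area. The following Python sketch illustrates the idea only; it is
not the GPGS implementation, and the function name window_to_screen, the tuple layout
and the rejection of points outside the drawing space are all assumptions.

```python
def window_to_screen(x, y, window, screen):
    """Map a point in the declared drawing space onto the screen area.

    `window` and `screen` are (xmin, ymin, xmax, ymax) tuples (assumed layout).
    """
    wx0, wy0, wx1, wy1 = window
    sx0, sy0, sx1, sy1 = screen
    if not (wx0 <= x <= wx1 and wy0 <= y <= wy1):
        raise ValueError("point lies outside the declared drawing space")
    # Fractional position of the point inside the drawing space.
    u = (x - wx0) / (wx1 - wx0)
    v = (y - wy0) / (wy1 - wy0)
    return sx0 + u * (sx1 - sx0), sy0 + v * (sy1 - sy0)
```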
A DESCRIPTION OF THE STRUCTURE DISPLAY PROGRAM IN EXECUTION

This  Appendix  contains  a  description  of  the  sequential  operations  performed  by  the 
structure  display  program  in  displaying  a  chemical  structure.  The  principal  routines  in 
the  program  and  their  functions  are: 

GETSH (GET SHape) obtains the next shape number;

TOSTAR (TO STARt position) matches bonds and calculates the shape origin and rotation;

DRAWSH (DRAW SHape) draws the shape.

The  program  first  enters  an  initialisation  phase  as  described  in  section  8.1  in 
order  to  establish  the  working  environment.  It  then  opens  the  random  access  file  RANDAT 
containing  the  shape  definitions  and  reads  into  memory  the  shape  index,  which  defines 
where  each  shape  description  is  stored  within  the  file. 

The  routine  GETSH  is  entered  to  obtain  from  the  structure  description  data  the 
number  of  the  first  shape  to  be  drawn  and  the  number  of  asterisks  associated  with  the 
shape  number,  ie  it  remembers  if  the  shape  is  a  link  shape  and  how  many  chains  are  to  be 
attached  to  it.  Finally  it  reads  the  shape  description  for  the  shape  number. 

The routine TOSTAR is entered to establish whether the first shape can be displayed
without rotation. The shape is examined for bonds at 0° and then for bonds at 180°, 270°
and 90° successively. If such a bond is found, the shape can be drawn in its normal
orientation and this bond will be used as the exit bond to connect to the next shape.

Thus the order of priorities in establishing the original direction of drawing is:

right horizontal
left horizontal
down vertical
up vertical.

If no horizontal or vertical bond is found, the normal shape connection method (see
section 6) is used to select an exit bond to match with a fictitious 180° bond and the
shape is thus rotated to transform the selected bond into a bond at 0°. This forces the
first chain to start in a right horizontal direction. A bond opposite the selected exit
bond is taken as the assumed entry bond. If the overlap check has been requested, TOSTAR
calls the routine DRAWSH in 'non-draw' mode to establish the maximum X and Y coordinates
of the first shape.
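
The priority search for the first shape can be sketched as follows. This Python
fragment is an illustration under stated assumptions, not the TOSTAR code: the function
names are hypothetical, and a rotation is assumed to be added to each bond angle, so
that a bond at angle A needs a rotation of -A (modulo 360°) to lie at 0°.

```python
PRIORITY = (0, 180, 270, 90)    # right, left, down, up


def first_exit_bond(bond_angles):
    """Return the first horizontal or vertical bond angle in priority order, or None."""
    present = {angle % 360 for angle in bond_angles}
    for angle in PRIORITY:
        if angle in present:
            return angle
    return None


def rotation_to_horizontal(selected_angle):
    """Rotation in degrees that turns the selected exit bond into a bond at 0 degrees."""
    return (-selected_angle) % 360
```

A shape with bonds at 90° and 180° would thus start the drawing to the left, while a
shape with bonds only at 30° and 150° returns None and falls back to the rotation case.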

The routine DRAWSH is entered in 'draw' mode to draw the shape with any required
rotation. It takes the current shape description and draws the appropriate lines, full
sized characters and half sized characters relative to the current origin, transforming
all coordinate positions according to the current rotation. It then adds bonds of a
standard length at the transformed coordinate positions and at the transformed angles, and
each bond is given a unique light pen identity for use in the interaction phase.
Once the shape has been drawn, the processing of the first shape is complete.
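
The coordinate transformation applied by DRAWSH can be illustrated with a short Python
sketch. It assumes anticlockwise rotation in degrees with the Y axis pointing up, which
agrees with 90° being 'up vertical' and 270° 'down vertical' above; the function names
are hypothetical and the fragment is not taken from the program itself.

```python
import math


def transform(point, origin, rotation_deg):
    """Rotate a shape-relative point about (0, 0), then shift it to the shape origin."""
    x, y = point
    ox, oy = origin
    t = math.radians(rotation_deg)
    return (ox + x * math.cos(t) - y * math.sin(t),
            oy + x * math.sin(t) + y * math.cos(t))


def bond_segment(position, angle_deg, origin, rotation_deg, length):
    """Return the start and end points of a bond after rotation and translation."""
    start = transform(position, origin, rotation_deg)
    # The bond angle is rotated together with the shape.
    a = math.radians((angle_deg + rotation_deg) % 360)
    end = (start[0] + length * math.cos(a), start[1] + length * math.sin(a))
    return start, end
```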

The routine GETSH is entered again. If the shape just drawn was a link shape then
GETSH updates the 'future link shape' stack with

(a) its shape number

(b) the X and Y coordinates of the current origin

(c) the current rotation which has been applied to the shape

(d) the number of link chains to be processed

(e) the light pen identity given to the first bond of the shape

(f) all the bond angles for the shape except

(i) that bond used for entry to the shape, or, for the first shape of a
structure, the bond which has been assumed to be the entry bond;

(ii) one bond opposite (180° out of phase with) the entry bond or assumed entry
bond, where the shape just drawn was the last shape in a chain.

The stack operates on a 'first in - first out' basis, and all operations on the stack are
performed by GETSH. In addition, a flag is set to indicate that the next exit/entry
connection to be made will involve a link shape and so the 'future link shape' stack of
available bonds must be updated on the next entry to GETSH to delete the exit bond.

GETSH  next  repeats  the  process  whereby  it  isolates  the  next  shape  number,  records  the 
number  of  asterisks  and  reads  the  shape  description  into  the  current  shape  description 
area.  The  program  now  has  one  shape  completely  drawn  and  all  the  information  necessary 
to  draw  the  next. 

The  routine  TOSTAR  decides  how  the  two  shapes  should  be  connected  together  and  it 
implements  the  rules  described  in  section  6.  Having  established  which  bonds  are  to  be 
linked  together,  it  takes  the  origin  position  for  the  previous  shape  and  any  rotation 
associated  with  it  and  it  calculates  a  new  origin  and  a  new  rotation  so  that  when  the 
shape  is  drawn  with  these  parameters,  the  required  bonds  will  connect.  If  a  rotation 
of  +90°  or  -90°  is  required  and  a  rotation  relation  exists,  the  current  shape  definition 
is  overwritten  with  that  of  the  rotation  relation  and  the  whole  matching  process  is 
repeated.  If  the  interference  check  has  been  requested,  the  routine  GETLEN  is  entered 
to  find  the  rectangle  containing  the  current  shape  and  to  check  for  overlaps  between 
this  rectangle  and  the  rectangle  containing  the  previous  shape.  If  the  rectangles 
overlap,  the  entry  bond  is  first  shortened  to  zero,  and  then  lengthened  up  to  14  times 
the  standard  bond  length  in  an  attempt  to  avoid  the  overlap. 
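
The rotation calculation and the rectangle check can be sketched in Python as below.
This is an illustration and not the TOSTAR or GETLEN code. The function names are
hypothetical, rectangles are assumed to be (xmin, ymin, xmax, ymax) tuples, rectangles
that merely touch are assumed not to overlap, and the trial lengths (the standard
length, zero, then whole multiples up to 14) are an assumption about the step size,
which the text does not state.

```python
def matching_rotation(exit_angle, entry_angle):
    """Rotation that makes the entry bond point back along the drawn exit bond.

    exit_angle: angle of the previous shape's exit bond as drawn.
    entry_angle: angle of the current shape's entry bond as defined.
    """
    return (exit_angle + 180 - entry_angle) % 360


def rectangles_overlap(a, b):
    """True if the two rectangles share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def choose_bond_length(rect_for_length, previous_rect, standard, max_factor=14):
    """Return the first trial length whose rectangle clears previous_rect, else None.

    rect_for_length(length) must return the rectangle of the current shape when it
    is attached with an entry bond of that length.
    """
    trials = [standard, 0] + [k * standard for k in range(2, max_factor + 1)]
    for length in trials:
        if not rectangles_overlap(rect_for_length(length), previous_rect):
            return length
    return None
```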

The  routine  DRAWSH  draws  the  shape  as  before. 

This whole process is repeated until GETSH completes the processing of a chain
record and any associated continuation records. At this stage, if the 'future link shape'
stack is empty then the structure is fully drawn. If it is not empty, the information
on the top of the stack is accessed to restore the program state to that which was current
when the link shape was first drawn. The link shape flag is also set so that, once TOSTAR
has selected an exit bond for the link shape, the next entry to GETSH deletes this bond
from those stored on the 'future link shape' stack. On the same pass, the number
of link chains to be processed is decremented. A link shape stays on top of the stack
until this count reaches zero, when the following item becomes the new 'top of stack'
and indicates the next link shape to be used. Having reverted to an earlier link shape
as the 'previous' shape, the drawing logic is then repeated as before.
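
The behaviour of the 'future link shape' stack can be modelled with a first in, first
out queue. The Python sketch below is illustrative only: the class and field names are
hypothetical, the capacity of 20 entries is an invented figure, and only the message
text is taken from this Report.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LinkEntry:
    shape: int              # shape number
    origin: tuple           # X and Y coordinates of the shape origin
    rotation: int           # rotation applied to the shape
    chains_left: int        # number of link chains still to be processed
    first_bond_id: int      # light pen identity of the first bond
    free_bonds: list = field(default_factory=list)   # bond angles still available


class FutureLinkShapes:
    def __init__(self, max_entries=20):      # capacity is a hypothetical value
        self.entries = deque()
        self.max_entries = max_entries

    def add(self, entry):
        if len(self.entries) >= self.max_entries:
            raise OverflowError("TOO MANY LINK SHAPES STACKED")
        self.entries.append(entry)

    def current(self):
        """Oldest link shape that still has chains to attach, or None."""
        return self.entries[0] if self.entries else None

    def use_bond(self, angle):
        """Delete the chosen exit bond and count one chain as processed."""
        entry = self.entries[0]
        entry.free_bonds.remove(angle)
        entry.chains_left -= 1
        if entry.chains_left == 0:
            self.entries.popleft()           # the next item becomes 'top of stack'
```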



When  a  structure  has  been  drawn  completely,  the  program  calculates  the  offset 
which  must  be  applied  to  centralise  the  rectangle  containing  the  drawing  and  this  offset 
is  applied  if  the  structure  is  displayed  again  during  the  interaction  phase  or  if  a  copy 
file  is  produced. 
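
The centring offset amounts to the difference between two rectangle centres. The Python
sketch below assumes that the drawing is centralised within the area allotted to it,
which this Appendix does not state explicitly; the function name and the
(xmin, ymin, xmax, ymax) layout are also assumptions.

```python
def centring_offset(drawing_rect, area_rect):
    """Offset (dx, dy) that moves the centre of drawing_rect to the centre of area_rect."""
    dx = (area_rect[0] + area_rect[2]) / 2 - (drawing_rect[0] + drawing_rect[2]) / 2
    dy = (area_rect[1] + area_rect[3]) / 2 - (drawing_rect[1] + drawing_rect[3]) / 2
    return dx, dy
```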

After the required array of structures has been drawn, the interaction phase is
entered. A light pen selection of a bond indicates that the user wishes to change the
length of that bond. The graphics display driver returns the unique light pen identity
of the bond selected and the program appends it to a list, together with the required
length of the bond, which is read from the Vector General keyboard. This list is accessed
by DRAWSH. As it draws each bond it checks whether this bond has been selected during
this program run and if so, it uses the requested bond length stored in the list;
otherwise it uses the standard bond length. The list is also used by TOSTAR when reverting
to a previously stored link shape so that it can calculate the correct shift of origin to
connect the two selected bonds together. Following each light pen interaction, the whole
array of structures is redrawn and this continues until the user responds with a RETURN
on the Vector General keyboard to indicate that the drawing is now acceptable. Any light
pen selection of a menu item is also processed at this time.
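
The list of altered bonds can be sketched as follows. This Python fragment is an
illustration only: the names, the standard length of 10.0 and the limit of 50 entries
are hypothetical, and letting the most recent request for a bond take precedence is an
assumption.

```python
STANDARD_LENGTH = 10.0      # hypothetical value
MAX_ALTERED = 50            # hypothetical capacity of the list


def record_selection(altered, identity, length, limit=MAX_ALTERED):
    """Append (light pen identity, requested length); refuse when the list is full."""
    if len(altered) >= limit:
        return False        # corresponds to 'TOO MANY BONDS ALTERED'
    altered.append((identity, length))
    return True


def bond_length(altered, identity, standard=STANDARD_LENGTH):
    """Length to draw for a bond: the latest request if any, else the standard length."""
    for ident, length in reversed(altered):
        if ident == identity:
            return length
    return standard
```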

If a copy file has been requested the program makes a further pass through the data
to produce the final, centred drawing data which can later be processed to produce
drawings on a hard copy device. After the copy has been produced the program either loops
to draw the next array or terminates if the structure description file is exhausted.

Because  the  program  may  make  more  than  one  pass  through  the  structure  description 
data  and  it  may  make  many  identical  accesses  to  the  shape  description  data,  considerable 
buffering  of  data  takes  place.  The  structure  description  data  for  the  complete  array  of 
structures  is  maintained  within  the  program  and  it  is  only  read  from  the  file  on  the 
first  pass.  The  shape  description  data  is  also  buffered,  but  in  a  cyclic  buffer.  If  a 
shape  needs  to  be  drawn  and  the  description  has  already  been  read  into  the  buffer,  the 
shape  data  is  advanced  round  the  buffer  in  order  to  delay  the  time  when  it  is  overwritten 
by  a  new,  incoming  shape  description.  The  descriptions  of  shapes  which  are  in  current 
common  usage  thus  tend  to  stay  in  the  buffer  and  are  immediately  available. 
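
The effect of the cyclic buffer resembles a least recently used cache. The Python sketch
below models it with an ordered dictionary rather than a true cyclic array, and it
assumes that 'advancing' a description moves it to the position that will be overwritten
last; the class name and the read_shape callable are hypothetical.

```python
from collections import OrderedDict


class ShapeBuffer:
    def __init__(self, capacity, read_shape):
        self.capacity = capacity
        self.read_shape = read_shape     # callable: shape number -> description
        self.slots = OrderedDict()       # oldest entry first

    def get(self, number):
        if number in self.slots:
            # Already buffered: advance it so that it is overwritten later.
            self.slots.move_to_end(number)
        else:
            if len(self.slots) >= self.capacity:
                self.slots.popitem(last=False)   # overwrite the oldest entry
            self.slots[number] = self.read_shape(number)
        return self.slots[number]
```

In this model a shape that is requested repeatedly is read from the file only once for
as long as it stays in the buffer.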

The  logic  of  the  program  is  shown  in  flowchart  form  in  Appendix  E. 
GETSH (Part 1): Routine to isolate the next shape number

GETSH (Part 2): Routine to isolate the next shape number

MATCH: Routine to select the bonds of the current and previous shapes to be connected


EXECUTION  EXCEPTION  CONDITIONS 


F.1 The shape definition program

The shape definition program checks the data as it is read in and whenever the
definition rules are violated an error message is printed out. The messages take the
following form

PPP NNN : Message

where 'PPP' is the previous shape number, 'NNN' is the current shape number, or 0 for
the rotation relation of PPP, and 'Message' indicates what type of discrepancy has been
discovered.

After printing the message, the program continues as if no error had been found.

The possible messages and their causes are:

(a) INVALID SHAPE NUMBER: shape number greater than 500.

(b) MORE THAN 83 LINES: the maximum number of lines has been exceeded.

(c) MORE THAN 15 FULL SIZED STRINGS: the maximum number of full sized strings has
been exceeded.

(d) INVALID FULL SIZED STRING LENGTH: a full sized string has been defined with no
characters or with more than 5 characters.

(e) MORE THAN 2 HALF SIZED STRINGS: the maximum number of half sized text strings
has been exceeded.

(f) INVALID HALF SIZED STRING LENGTH: a half sized string has been defined with no
characters or with more than 15 characters.

(g) INVALID NUMBER OF BONDS: the maximum number of bonds has been exceeded.

(h) MTH BOND ANGLE INVALID:RRR: bond angle number M has value RRR which is not a
multiple of 15°.

(i) TOO MANY SHAPES: the maximum number of shapes has been exceeded.

(j) INVALID FIRST CHARACTER: the first character of a record is not 'V', 'F', 'S',
'H' or 'B'. The character is assumed to be a space.

F.2 The structure display program

The program recognises the following conditions in processing the input data and
produces a message where appropriate:

(a) Blank data records in the structure definition file are ignored.

(b) PAUSE: UNKNOWN CHARACTER: The program has encountered an unexpected character
when processing a chain description record, ie not '0' to '9', or '*'. The character
is ignored if the program is restarted.

(c) PAUSE: NUMBER UNTERMINATED: A field of more than 15 characters has been found
when analysing the next shape number. This field width includes digits and '*'s. The
number is assumed to be terminated by the 15th character if the program is restarted.
(d) NO BOND MATCH POSSIBLE: It is not possible to link the current shape to the
previous shape because

(i) a non-link shape has only one bond and so no bond is available as an
exit bond;

(ii) a link shape has no free bonds left to be used as an exit bond.

A blinking '# # ' is displayed on the Vector General screen as a further warning to the
user.

(e) ***SHAPE NOT DEFINED NNN***: This message is displayed when a shape requested
in the structure definition file has not been defined in the shape definition file.
The invalid shape number is ignored and, arbitrarily, shape 500 is used instead.

(f) TOO MANY BONDS ALTERED: The user has attempted to define the length of more
bonds than the program can store. The program ignores the current request to change a
bond length.

(g) TOO MANY STRUCTURE DEFINITION RECORDS: The number of records or the number of
characters defining the current array of structures has exceeded the maximum which the
program can store. Program execution is terminated.

(h) TOO MANY LINK SHAPES STACKED: The number of link shapes stored on the stack
for future use has exceeded the maximum size of the stack. Program execution is
terminated.

(i) NO. OF OVERLAP BOXES EXCEEDED: The number of shapes in the current structure
has exceeded the limit which can be handled by the overlap check. Execution is continued
and the limits of the first rectangle are overwritten by the limits for the current shape.
Table 1

ORDER OF SEARCH FOR A MATCH BETWEEN THE BONDS OF A LINK SHAPE AND THE CURRENT SHAPE

PHASE            Link bond 1          Link bond 2          Link bond 3
DIFFERENCE    Current  Current     Current  Current     Current  Current
(deg)         bond 1   bond 2      bond 1   bond 2      bond 1   bond 2

180              1        2           9       10          17       18
0                3        4          11       12          19       20
90               5        6          13       14          21       22
270              7        8          15       16          23       24
210             25       26          33       34          41       42
30              27       28          35       36          43       44
120             29       30          37       38          45       46
300             31       32          39       40          47       48
225             49       50          57       58          65       66
45              51       52          59       60          67       68
135             53       54          61       62          69       70
315             55       56          63       64          71       72
240             73       74          81       82          89       90
60              75       76          83       84          91       92
150             77       78          85       86          93       94
330             79       80          87       88          95       96
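
The numbering in Table 1 follows a regular pattern: the phase differences fall into four
groups of four angles, and within each group the search runs through the link bonds,
then the angles, then the bonds of the current shape. The Python sketch below reproduces
the tabulated values for three link bonds and two current bonds. The function name is
hypothetical, and extending the pattern to other numbers of bonds is an assumption which
the table itself does not confirm.

```python
ANGLE_GROUPS = (
    (180, 0, 90, 270),
    (210, 30, 120, 300),
    (225, 45, 135, 315),
    (240, 60, 150, 330),
)


def search_position(phase, link_bond, current_bond, link_bonds=3, current_bonds=2):
    """Position in the search order (from 1); bonds are numbered from 1."""
    for g, group in enumerate(ANGLE_GROUPS):
        if phase in group:
            a = group.index(phase)
            break
    else:
        raise ValueError(f"phase difference {phase} is not tabulated")
    per_link = len(group) * current_bonds     # 8 entries per link bond in Table 1
    per_group = per_link * link_bonds         # 24 entries per angle group in Table 1
    return (g * per_group + (link_bond - 1) * per_link
            + a * current_bonds + current_bond)
```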
Fig 1a-e A shape and its graphical constituents

Fig 2a-c Rotation and translation of shapes in order to connect them together
a Shapes to be connected
b Shapes after rotation to match bond angles
c Shapes after translation to match bond positions

Fig 4a-c Problems of shape rotation

Fig 5 Problems of rotating subscript text

Fig 7 Shapes displayed in various orientations

Fig 8 Examples of group drawings

Fig 9 Examples of group drawings

Fig 10 Examples of group drawings

Fig 11 Examples of polymer drawings

Fig 13 Examples of polymer drawings